Basics of ggplot2

Robert Schlegel

Problem

  • Default R plotting (Base R) hasn’t advanced much since the 90s
  • Non-intuitive syntax, functions, and arguments
  • Not enough control over final plot
  • Published figures do not look very professional

Solution

  • The ggplot2 package uses the grammar of graphics
  • Is integrated into the tidyverse
  • Easier syntax with intuitive functions and arguments
  • Massive range of well-developed support and extensions

Setup

We will need the following two packages for the examples in these slides.

library(tidyverse) # Contains ggplot2

library(palmerpenguins) # Contains the dataset

Basics

  • One figure is made of one chunk of code
  • Starts with ggplot()
  • Each line is connected with a +
  • Add shapes with geom functions
    • e.g. geom_point() adds points
  • Plot skeleton (which columns map to x, y, colour, etc.) is created within aes()
  • Arguments assigned like normal functions (e.g. ggplot(data = penguins))

Basic plot

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = species))

Focus on aes()

  • Understand when to use aes()
  • Columns from the data go inside aes()
    • e.g. geom_point(aes(colour = species))
  • Static values go outside aes()
  • Mistakes with aes() are common when learning ggplot2
  • Good starting point when looking for errors

Inside aes()

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = island)) # 'island' is a column from 'penguins'

Outside aes()

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(colour = "red") # Make all points red

Outside inside?

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = "red")) # What is happening here?
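
One way to answer the question above, as a small sketch rather than a definitive explanation: inside aes(), "red" is treated as data (a constant with a single value), so ggplot2 assigns it a colour from its default palette instead of drawing literal red. The object name p is only for illustration; layer_data() is a ggplot2 helper that returns the data a layer is actually drawn with.

```r
library(ggplot2)
library(palmerpenguins)

# "red" inside aes() is mapped like a column with one category called "red"
p <- ggplot(data = penguins,
            aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = "red"))

# Inspect the colour(s) the points are actually drawn with;
# this is a palette colour chosen by the scale, not the string "red"
drawn_colours <- unique(layer_data(p, 1)$colour)
drawn_colours
```

The legend that appears with a single key labelled "red" is another hint that the value was mapped rather than set.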

Inside outside?

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(colour = species) # Why does this cause an error?
Error in list2(na.rm = na.rm, ...): object 'species' not found
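
A minimal sketch of why this fails and how to fix it: outside aes(), species is evaluated as an ordinary R object, and no object with that name exists in the session, so R cannot find it. Wrapping the argument in aes() tells ggplot2 to look the name up as a column of penguins. The object name p_fixed is only for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# 'species' is a column of the data frame, not a free-standing object
"species" %in% names(penguins)

# Fix: put the column inside aes() so it is looked up in 'penguins'
p_fixed <- ggplot(data = penguins,
                  aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = species))
p_fixed
```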

More than just colour

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(size = flipper_length_mm, shape = island)) # What else can we add?
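
As one possible answer to the question above, the sketch below combines mapped and static aesthetics in the same geom. The particular choices (alpha = 0.6, size = 3) are arbitrary illustrations, not recommendations from the slides.

```r
library(ggplot2)
library(palmerpenguins)

p_mixed <- ggplot(data = penguins,
                  aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = species, shape = island), # columns go inside aes()
             alpha = 0.6, size = 3)                  # static values go outside
p_mixed
```

Transparency (alpha) is often useful when many points overlap.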

Add a geom

ggplot(data = penguins,
       # NB: Arguments passed to first aes() apply to all geoms
       aes(x = body_mass_g, y = bill_length_mm, colour = species)) +
  geom_point() +
  geom_smooth(method = "lm")
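
To illustrate the note about the first aes(), here is a contrasting sketch: when colour is mapped only inside geom_point(), geom_smooth() does not inherit it and fits a single line through all penguins instead of one per species. The object name p_one_line is only for illustration.

```r
library(ggplot2)
library(palmerpenguins)

p_one_line <- ggplot(data = penguins,
                     aes(x = body_mass_g, y = bill_length_mm)) +
  geom_point(aes(colour = species)) + # colour applies to this geom only
  geom_smooth(method = "lm")          # no colour mapping, so one overall fit
p_one_line

# Number of groups the smoother was fitted to (layer 2)
n_smooth_groups <- length(unique(layer_data(p_one_line, 2)$group))
n_smooth_groups
```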

Change labels

ggplot(data = penguins,
       aes(x = body_mass_g, y = bill_length_mm, colour = species)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(x = "Body mass (g)", y = "Bill length (mm)", colour = "Species") + # Change labels
  theme(legend.position = "bottom") # Change legend position